We consider a version of large population games whose agents compete for resources using strategies with adaptable preferences. Diversity among the agents reduces their maladaptive behavior. We find interesting scaling relations with diversity for the variance of decisions. When diversity increases, the scaling dynamics is modified by kinetic sampling and waiting mechanisms.

Many natural and artificial systems involve interacting agents, each making independent decisions to compete for limited resources, but globally exhibit coordinated behavior through their mutual adaptation [1, 2]. Examples include the competition of predators in ecology, buyers or sellers in economic markets, and routers in computer networks. While a standard approach is to analyse the steady state behavior of the system described by the Nash equilibria [3], it is interesting to consider the dynamics of how the steady state is approached. Dynamical studies are especially relevant when one considers the effects of a changing environment, such as those in economics or distributed control.

The recently proposed Minority Games (MG) are prototypes of such multi-agent systems. The dynamical nature of the adaptive processes is revealed when the complexity of the agents is low, where the final states depend on the initial conditions [4, 5]. Here, the system exhibits large fluctuations, which are caused by the initially zero preference of strategies for all agents. However, when the game is used to model economic systems, it is not realistic to expect that all agents enter the market with the same preference. Besides, in games which use public information only, this implies that different agents would maintain identical preferences of strategies at all subsequent steps, which is again unlikely. Furthermore, when the game is used to model distributed control in multi-agent systems, identical preferences of strategies of the agents lead to maladaptive behavior, which refers to the bursts of the population's decisions due to their premature rush to certain states, compromising the system efficiency. There were attempts at improvement by introducing thermal noise, biased starts [13], biased strategies and random initial conditions. However, no systematic studies have been made.

In this Letter, we consider the effects of introducing randomness in the initial preferences of strategies among the agents, focusing on the regime of low complexity, where analyses assuming vanishing step sizes are not applicable [13]. Concretely, we consider a population of $N$ agents competing in an environment of limited resources, $N$ being odd. Each agent makes a decision $+$ or $-$ at each time step, and the minority group wins. The decisions of each agent are prescribed by strategies, which are Boolean functions mapping the history of the winning bits in the most recent $m$ steps to decisions $+$ or $-$. Before the game starts, each agent randomly picks $s$ strategies. Out of her $s$ strategies, each agent makes decisions according to the most successful one at each step; the success of a strategy is measured by its virtual point, which increases (decreases) by 1 if it indicates a winning (losing) decision at a time step.
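
The rules of the basic game can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `play_minority_game`, the integer encoding of histories, and the deterministic tie-breaking rule (the lowest-indexed strategy wins a tie) are assumptions of this example, not details taken from the text.

```python
import random

def play_minority_game(n_agents=101, m=2, s=2, steps=200, seed=0):
    """Run the basic Minority Game; return the summed decisions at each step."""
    rng = random.Random(seed)
    n_hist = 2 ** m  # number of distinct histories, D = 2^m
    # strategies[i][a][mu]: decision (+1 or -1) of strategy a of agent i at history mu
    strategies = [[[rng.choice((-1, 1)) for _ in range(n_hist)]
                   for _ in range(s)] for _ in range(n_agents)]
    points = [[0] * s for _ in range(n_agents)]  # virtual points, all zero initially
    history = rng.randrange(n_hist)
    totals = []
    for _ in range(steps):
        total = 0
        for i in range(n_agents):
            # Assumption: ties are resolved in favour of the lowest index.
            best = max(range(s), key=lambda a: points[i][a])
            total += strategies[i][best][history]
        # The minority wins; total is never zero because n_agents is odd.
        winner = -1 if total > 0 else 1
        for i in range(n_agents):
            for a in range(s):
                points[i][a] += 1 if strategies[i][a][history] == winner else -1
        # Append the winning bit to the history and keep the last m bits.
        history = ((history << 1) | (1 if winner == 1 else 0)) % n_hist
        totals.append(total)
    return totals
```

Dividing each returned total by `n_agents` gives the component of the decision sum at the current history.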

In contrast to early versions of the game, the agents may enter the game with diverse preferences of their strategies. This is done by randomly assigning $R$ virtual points to the $s$ strategies of each agent before the game starts. Hence the initial virtual point of each strategy obeys a multinomial distribution with mean $R/s$ and variance $R(s-1)/s^2$. The ratio $\rho = R/N$ is referred to as the diversity. In particular, for $s = 2$ and odd $R$, no two strategies have the same virtual points throughout the game, and the dynamics of the game is deterministic, resulting in highly precise simulation results useful for refined comparison with theories.
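
As a small check of the stated mean and variance, the sketch below assigns the $R$ points one at a time to a uniformly chosen strategy (an assumption about what "randomly assigning" means) and computes the exact moments of a single strategy's initial virtual point. The function names are hypothetical.

```python
import random
from fractions import Fraction
from math import comb

def assign_virtual_points(R, s, rng):
    """Give each of R points to one of s strategies chosen uniformly at random."""
    points = [0] * s
    for _ in range(R):
        points[rng.randrange(s)] += 1
    return points

def marginal_moments(R, s):
    """Exact mean and variance of one strategy's initial virtual point.

    The marginal of the multinomial distribution is Binomial(R, 1/s).
    """
    p = Fraction(1, s)
    pmf = [comb(R, k) * p ** k * (1 - p) ** (R - k) for k in range(R + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var
```

For $s = 2$ the two initial points sum to $R$, so their difference has the parity of $R$. Each time step changes the difference by 0 or ±2, which is why an odd $R$ excludes ties at all times.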

To describe the macroscopic dynamics of the system, we define the $D$-dimensional vector $A^\mu(t)$, which is the sum of the decisions of all agents responding to history $\mu$ of their strategies, normalized by $N$, where $D = 2^m$ is the number of histories. While only one of the $D$ components corresponds to the historical state $\mu^*(t)$ of the system, the augmentation to $D$ components is necessary to describe the attractor structure and the transient behavior of the system dynamics. The inset of Fig. 1 illustrates the convergence to the attractor for the visualizable case of $m = 1$. The dynamics proceeds in the direction which tends to reduce the magnitude of the components of $A^\mu(t)$ [4]. However, a certain amount of maladaptation always exists in the system, so that the components of $A^\mu(t)$ overshoot, resulting in periodic attractors with a period of $2D$. Every state $\mu$ appears as the historical state two times in a steady-state period, yielding the winning bits $-$ and $+$ each exactly once. One occurrence brings $A^\mu$ from positive to negative, and the other brings it back from negative to positive, thus completing a cycle. For $m = 1$, the steady state is described by the sequence $\mu(t) = -, +, +, -$, where both states $-$ and $+$ are followed by $-$ and $+$ once each. For general values of $m$, the states in an attractor are given by a binary de Bruijn sequence of order $m + 1$.
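
The de Bruijn property can be illustrated with a short sketch. The generator below uses the standard recursive construction from Lyndon words, and the encoding of $-$ as 0 and $+$ as 1 is an assumption of this example.

```python
def de_bruijn_binary(n):
    """Return one binary de Bruijn sequence of order n as a list of bits."""
    a = [0] * (n + 1)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, 2):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence

def is_de_bruijn(seq, n):
    """Check that every n-bit window occurs exactly once in the cyclic sequence."""
    length = len(seq)
    windows = {tuple(seq[(i + j) % length] for j in range(n))
               for i in range(length)}
    return length == 2 ** n and len(windows) == length
```

For $m = 1$ the order is 2, and the sequence $-, +, +, -$ quoted in the text (bits 0, 1, 1, 0) passes the check: each one-bit history is followed once by each winning bit.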

As shown in Fig. 1, the variance $\sigma^2/N$ of the population for decision $+$ scales as a function of the complexity $\alpha = D/N$, agreeing with previous observations. When $\alpha$ is small, games with increasing complexity create time series of decreasing fluctuations. A phase transition takes place around $\alpha_c \approx 0.3$, after which the variance increases gradually to the limit of random decisions, with $\sigma^2/N = 0.25$. When $\alpha < \alpha_c$, we have the symmetric phase, in which the occurrences of decisions 1 and 0 responding to a given historical state $\mu$ are equal, whereas in the asymmetric phase above $\alpha_c$, the occurrences are biased for at least some history $\mu$. Figure 1 also shows the data collapse for different $N$ for $\rho \sim 1$, indicating that the variance is a function of $\rho$. It is observed that the variance decreases significantly with diversity in the symmetric phase, and remains unaffected in the asymmetric phase.

FIG. 1: The dependence of the variance of the population for decision $+$ on the complexity for different diversities at $s = 2$, averaged over 128 samples. The horizontal dotted line is the limit of random decisions. Inset: the state motion of a sample in the phase space for $m = 1$. Solid dots: attractor states.
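
The random-decision limit $\sigma^2/N = 0.25$ follows from the binomial distribution of the number of agents choosing $+$. A minimal exact check, with a hypothetical function name, is:

```python
from fractions import Fraction
from math import comb

def random_decision_variance_per_agent(N):
    """Exact sigma^2 / N when each of N agents picks + with probability 1/2."""
    pmf = [Fraction(comb(N, n), 2 ** N) for n in range(N + 1)]
    mean = sum(n * w for n, w in enumerate(pmf))
    var = sum((n - mean) ** 2 * w for n, w in enumerate(pmf))
    return var / N
```

The result is exactly 1/4 for every $N$, which is the horizontal dotted line described in the caption of Fig. 1.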

The dependence of the variance on the diversity is further shown in Fig. 2 for given memory sizes $m$. Here we focus on the physical picture of the dynamics. Four regimes can be identified:

(a) Multinomial regime. When $\rho \sim N^{-1}$, $\sigma^2/N \sim N$ with proportionality constants dependent on $m$. To analyse this and other regimes, we let $S_{\alpha\beta}(\omega)$ be the number of agents holding strategies $\alpha$ and $\beta$ (with $\alpha < \beta$), with the virtual point of strategy $\alpha$ initially displaced by $\omega$ with respect to $\beta$. The average of $S_{\alpha\beta}(\omega)$ over initial conditions is proportional to the binomial distribution of virtual points, i.e.

$\langle S_{\alpha\beta}(\omega)\rangle = N\,C^{R}_{\ldots}$

The key to analysing the system dynamics is the observation that the virtual points of a strategy displace by exactly the same amount for all agents. Hence for a given strategy pair, the profile of the virtual point distribution remains unchanged, but the peak position shifts with the game dynamics. If the virtual point displacement of strategy $\alpha$ at time $t$ is $\Omega_\alpha(t)$, then the agents holding strategies $\alpha$ and $\beta$ make decisions according to strategy $\alpha$ if $\omega + \Omega_\alpha(t) - \Omega_\beta(t) > 0$, and strategy $\beta$ otherwise. At time $t$, we can write $\Omega_\alpha(t) = \sum_\mu k_\mu(t)\xi_\alpha^\mu$, where $\xi_\alpha^\mu = \pm 1$ is the decision of strategy $\alpha$ in response to history $\mu$, and $k_\mu(t)$ is the number of wins minus losses of decision $+$ up to time $t$ when the game responded to history $\mu$. Consider the difference

$A^\mu(t) - A^\mu(0) = \frac{1}{N}\sum_{\alpha<\beta}\sum_\omega S_{\alpha\beta}(\omega)\,(\xi_\alpha^\mu - \xi_\beta^\mu)\,[\Theta(\omega + \Omega_\alpha(t) - \Omega_\beta(t)) - \Theta(\omega)]$.

Its average can be found by introducing the average $\langle S_{\alpha\beta}(\omega)\rangle$, writing the step function as a sum over Kronecker deltas and introducing their integral representation, using the identity

$\cos k\theta + i\sin k\theta\cos\theta \ldots -\sin \ldots$

and noting that $\sum \ldots = 0$. The final result is

$\langle A^\mu(t) - A^\mu(0)\rangle = \frac{1}{2\pi}\int_0^{2\pi} d\theta\,\cos^R\theta\,\frac{\sin k_\mu(t)\theta}{\sin\theta} \times \cos k_\mu(t)\theta \prod_{\nu\neq\mu}\cos^2 k_\nu(t)\theta$. (1)

FIG. 2: The dependence of the variance of the population for decision $+$ on the diversity at $m = 1$ and $s = 2$. Symbols: simulation results averaged over 1024 samples. Lines: theory. Dashed-dotted line: scaling prediction. Inset: Comparison between simulation results (symbols), theory with waiting effects included (lines) and excluded (dashed lines).

When $\rho \sim N^{-1}$, $\langle A^\mu(t+1) - A^\mu(t)\rangle \sim O(1)$ and is self-averaging. Since $A^\mu(0)$ is Gaussian with variance $N^{-1}$, the values of $A^\mu(t)$ at the attractors can be computed, and the variance found. For example, for $m = 1$, $\sigma^2/N \equiv N\langle[A^{\mu^*(t)}(t) - \langle A^{\mu^*(t)}(t)\rangle]^2\rangle/4 = N[7(c_{R+1})^2 - 2c_{R+1}c_{R+3} + 7(c_{R+3})^2]/128$, where $c_n \equiv 2^{-n}C^n_{n/2}$ for even integer $n$.
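
The $m = 1$ attractor values in this regime can be computed with a short sketch. It rests on several assumptions that are not spelled out in the text: $s = 2$ and odd $R$, self-averaging steps, an initial $A^\mu(0)$ that is negligibly small but of definite sign, and equal weight for the four sign choices. Under these assumptions the time variance of $A^{\mu^*(t)}(t)$ comes out as $[7(c_{R+1})^2 - 2c_{R+1}c_{R+3} + 7(c_{R+3})^2]/32$, that is, $\sigma^2/N$ equal to $N/128$ times the bracket. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def c(n):
    """c_n = 2^(-n) * C(n, n/2) for even n."""
    return Fraction(comb(n, n // 2), 2 ** n)

def mean_shift(R, k_mu, k_nu):
    """Average of A^mu(t) - A^mu(0) for m = 1, given the two win-loss counts."""
    diffs = ((-2, Fraction(1, 4)), (0, Fraction(1, 2)), (2, Fraction(1, 4)))
    total = Fraction(0)
    for d_mu, p_mu in diffs:          # xi_alpha - xi_beta at history mu
        for d_nu, p_nu in diffs:      # xi_alpha - xi_beta at the other history
            shift = d_mu * k_mu + d_nu * k_nu
            for j in range(R + 1):
                omega = 2 * j - R     # initial displacement, odd for odd R
                weight = Fraction(comb(R, j), 2 ** R)
                switched = int(omega + shift > 0) - int(omega > 0)
                total += p_mu * p_nu * weight * d_mu * switched
    return total

def attractor_variance(R, signs, eps=Fraction(1, 10 ** 9), transient=40):
    """Time variance of A at the historical state over one period of 4 steps."""
    k = [0, 0]
    history = 0                       # 0 encodes -, 1 encodes +
    values = []
    for t in range(transient + 4):
        a = signs[history] * eps + mean_shift(R, k[history], k[1 - history])
        if t >= transient:
            values.append(a)
        winner = -1 if a > 0 else 1   # the minority side wins
        k[history] += winner
        history = 1 if winner == 1 else 0
    mean = sum(values) / 4
    return sum((v - mean) ** 2 for v in values) / 4

def averaged_variance(R):
    """Average the attractor variance over the four sign choices of A(0)."""
    cases = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    return sum(attractor_variance(R, sg) for sg in cases) / 4
```

In this sketch a single step from the origin has size $c_{R+1}$, and a step taken while the other component is displaced has size $c_{R+3}$, which is where the two constants in the formula come from.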

(b) Scaling regime. When $\rho \sim 1$, $\sigma^2/N \sim \rho^{-1}$ with proportionality constants effectively independent of $m$ for $m$ not too large. In this case, Eq. (1) can be simplified to $\langle A^\mu(t) - A^\mu(0)\rangle = k_\mu(t)\sqrt{2/\pi R}$. The average step size becomes $\langle A^\mu(t+1) - A^\mu(t)\rangle = \sqrt{2/\pi R} \sim O(N^{-1/2})$ and is self-averaging. To interpret this result, we note that changes in $A^\mu(t)$ are only contributed by fickle agents with marginal preferences of their strategies. That is, those with $\omega + \Omega_\alpha(t) - \Omega_\beta(t) = \pm 1$ and $\xi_\alpha^\mu - \xi_\beta^\mu = \mp 2\,\mathrm{sgn}A^\mu(t)$ for $\mu = \mu^*(t)$. For large $R$, the binomial virtual point distribution among agents of a given strategy pair is effectively a Gaussian with variance $R$. Hence the number of agents switching strategies at time $t$ scales as the height of the Gaussian distribution, which is $\sqrt{2/\pi R}$. Thus, by spreading the virtual point distribution, diversity reduces the step size and hence maladaptation.
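
The Gaussian height can be compared with the exact binomial weight of the marginal agents. The sketch assumes $s = 2$ and odd $R$, so that the initial displacement $\omega = 2j - R$ is odd with $j$ binomially distributed. The function names are hypothetical.

```python
from math import comb, pi, sqrt

def marginal_fraction(R):
    """Exact probability that the initial displacement omega equals +1 (odd R)."""
    if R % 2 == 0:
        raise ValueError("R must be odd")
    return comb(R, (R + 1) // 2) / 2 ** R

def gaussian_height(R):
    """Height sqrt(2 / (pi R)) of the Gaussian approximation on a lattice of spacing 2."""
    return sqrt(2 / (pi * R))
```

The ratio of the two quantities approaches 1 as $R$ grows, with a relative deviation of order $1/R$.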

As a result, each state of the attractor is confined in a $D$-dimensional hypercube of size $\sqrt{2/\pi R}$, irrespective of the initial position of the $A^\mu$ components. Starting from the initial state $A^\mu(0)$, the state changes in steps of size $\sqrt{2/\pi R}$ until it reaches the attractor, whose $2D$ historical states are given by $\sqrt{2/\pi R}\,\lceil\sqrt{\pi R/2}\,A^\mu(0)\rceil$ and $\sqrt{2/\pi R}\,\{\lceil\sqrt{\pi R/2}\,A^\mu(0)\rceil - 1\}$, where $\lceil x\rceil$ represents the decimal part of $x$. Averaging over $A^\mu(0)$, which are Gaussian numbers with mean 0 and variance $1/N$, the variance of decisions becomes $\sigma^2/N = f(\rho)/2\pi\rho$, where $f(\rho)$ approaches $(1 - 1/4D)/3$ for $\rho \gg 1$. Note that $f(\rho)$ is a smooth function of $\rho$, since $\sigma^2/N$ depends on $\rho$ mainly through the step size factor $1/2\pi\rho$, whereas $f(\rho)$ merely provides a higher order correction to the functional dependence. This accounts for the scaling regime in Fig. 2. Furthermore, we note that $f(\rho)$ rapidly approaches 1/3 when $m$ increases. Hence for general values of $m$, $\sigma^2/N \to 1/6\pi\rho$, provided that $m$ is not too large.
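
The limiting value of $f(\rho)$ can be recovered by a short calculation. It is a sketch under two assumptions: the decimal parts $x_\mu$ are independent and uniformly distributed on $[0, 1)$, which should hold when the step size is much smaller than the spread of $A^\mu(0)$, and the variance is taken over the $2D$ historical states of one period before averaging over samples. The symbol $\Delta_s$ for the step size $\sqrt{2/\pi R}$ and the overbar for the average over a period are introduced here only for brevity.

```latex
% Assumptions: x_\mu uniform on [0,1), independent over the D histories;
% \Delta_s = \sqrt{2/\pi R}; overbar = average over the 2D states of one period.
\begin{align*}
A^\mu_{+} &= \Delta_s\, x_\mu, \qquad A^\mu_{-} = \Delta_s\,(x_\mu - 1), \\
\overline{A^2} &= \frac{\Delta_s^2}{2D}\sum_\mu \left[x_\mu^2 + (x_\mu - 1)^2\right]
  \;\Rightarrow\; \left\langle \overline{A^2} \right\rangle = \frac{\Delta_s^2}{3}, \\
\overline{A} &= \frac{\Delta_s}{2D}\sum_\mu (2x_\mu - 1)
  \;\Rightarrow\; \left\langle \overline{A}^{\,2} \right\rangle
  = \frac{\Delta_s^2}{4D^2}\cdot D\cdot\frac{1}{3} = \frac{\Delta_s^2}{12D}, \\
\frac{\sigma^2}{N} &= \frac{N}{4}\left\langle \overline{A^2} - \overline{A}^{\,2} \right\rangle
  = \frac{N\Delta_s^2}{4}\cdot\frac{1}{3}\left(1 - \frac{1}{4D}\right)
  = \frac{1}{2\pi\rho}\cdot\frac{1}{3}\left(1 - \frac{1}{4D}\right).
\end{align*}
```

For $m = 1$ this gives $f = 7/24$, and the correction $1/4D$ vanishes rapidly as $m$ increases.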

(c) Kinetic sampling regime. When $\rho \sim N$, $\sigma^2/N$ deviates above the scaling with $\rho^{-1}$, and is given by $\sigma^2/N = f_m(\Delta)/N$, where $\Delta = \sqrt{2N/\pi\rho}$ is the kinetic step size, and $f_m$ is a function dependent on the memory size $m$. Here $A^\mu(t+1) - A^\mu(t)$ scales as $N^{-1}$ and is no longer self-averaging. Rather, it is equal to $2/N$ times the number of agents who switch strategies at time $t$, which is Poisson distributed with a mean of $\Delta/2$. However, since the attractor is formed by steps which reverse the sign of $A^\mu$, the average step size in the attractor is larger than that in the transient state. To see this, we consider the probability $P_{\rm att}(\Delta A)$ of step sizes $\Delta A$ in the attractor. Assuming that all states of the phase space are equally likely to be accessed, we have $P_{\rm att}(\Delta A) = \sum_A P_{\rm att}(\Delta A, A)$, where $P_{\rm att}(\Delta A, A)$ is the probability of finding the position $A$ with displacement $\Delta A$ in the attractor. Consider the example of $m = 1$ in the inset of Fig. 1. The sign reversal condition implies that $P_{\rm att}(\Delta A, A) = P_{\rm Poi}(\Delta A)\prod_\mu \Theta[-A^\mu(A^\mu + \Delta A^\mu)]$, where $P_{\rm Poi}(\Delta A)$ is the Poisson distribution of step sizes, yielding $P_{\rm att}(\Delta A) = P_{\rm Poi}(\Delta A)\prod_\mu \Delta A^\mu$. Thus the attractor averages $\langle(\Delta A^\pm)^2\rangle_{\rm att}$, which are required for computing the variance of decisions, are given by $\langle(\Delta A^\pm)^2\,\Delta A^+\Delta A^-\rangle_{\rm Poi}/\langle\Delta A^+\Delta A^-\rangle_{\rm Poi}$. In other words, the sampling of the step sizes is weighted by the attractor sizes due to the kinetics. The result for $m = 1$ is $\sigma^2/N = (14\Delta^3 + 105\Delta^2 + 132\Delta + 24)/[96N(2\Delta + 1)]$.
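
The $m = 1$ result can be evaluated directly to see how it interpolates between the neighbouring regimes. The sketch below assumes that the kinetic step size is $\Delta = \sqrt{2N/\pi\rho}$. The function names are hypothetical.

```python
from math import pi, sqrt

def kinetic_step(N, rho):
    """Kinetic step size Delta = sqrt(2 N / (pi rho))."""
    return sqrt(2 * N / (pi * rho))

def variance_kinetic_m1(N, rho):
    """sigma^2 / N for m = 1 from the kinetic sampling formula."""
    d = kinetic_step(N, rho)
    return (14 * d ** 3 + 105 * d ** 2 + 132 * d + 24) / (96 * N * (2 * d + 1))
```

For large $\Delta$ the expression tends to $7\Delta^2/96N = 7/48\pi\rho$, which is the scaling-regime value $f/2\pi\rho$ with $f = (1 - 1/4D)/3 = 7/24$ at $D = 2$. For small $\Delta$ it tends to $1/4N$.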

(d) Waiting regime. When $\rho \gg N$, $\sigma^2/N$ deviates above the predictions of kinetic sampling. Here the agents are so diverse that the average step size is approaching 0. At each state in the phase space, the system remains stationary for many time steps, waiting for some agent to reduce the magnitude of her virtual point until strategy switching can take place. This waiting effect modifies the composition of the group of fickle agents who contribute to the state transitions, and consequently increases the step sizes and variance above those predicted by kinetic sampling. Consider the example of $m = 1$. As shown in the inset of Fig. 1, the attractor consists of both vertical and horizontal hops, and detailed analysis shows that only one type of agents can complete both hops. Since fewer and fewer agents contribute to the switching of states in the limit $\rho \gg N$, a single agent of this type will dominate the game dynamics, and one would expect that $\sigma^2/N$ approaches $1/4N$. However, when waiting is possible, agents not of this correct type can wait for other agents to complete the hops in the attractor, even though one would expect that the probability of finding more than one fickle agent is drastically less than that for one. In fact, analysis shows that the attractor consists of a single fickle agent with a probability of 1/11 only, and $\sigma^2/N$ approaches $9/22N$ rather than $1/4N$. As shown in the inset of Fig. 2, lengthy analytic results including waiting effects significantly improve the agreement with simulations over the kinetic sampling prediction.

Many properties of the system dependent on the transient dynamics also depend on its diversity. For example, since diversity reduces the fraction of agents switching strategies at each time step, it also slows down the convergence to the steady state. Hence in the scaling regime, the convergence time scales as $\rho^{1/2}$. Specifically, when $\rho \gg 1$, the average convergence time becomes $(2 + \sqrt{2})\sqrt{\rho}$ for $m = 1$. Similarly, the distribution of payoffs among the frozen agents (that is, agents who do not switch their strategies at the steady state) also depends on the transient. Since the system dynamics reaches a periodic attractor, they have constant average payoffs at the steady state. Hence any spread in their payoff distribution is a consequence of the transient dynamics. Thus, in the scaling regime, the mean square payoff scales as $\rho$. Specifically, when $\rho \gg 1$, the mean square payoff becomes $\pi\rho$ for $m = 1$. Simulation results of both the convergence time and the mean square payoff are in excellent agreement with the theory [14].

The results presented here can be generalized to other cases. Consider the exogenous MG, in which the information $\mu(t)$ is randomly and independently drawn at each time step $t$. The picture that the states of the game are hopping between hypercubes in the phase space remains valid. At the steady state, the attractor consists of hoppings among all vertices of a hyperpolygon enclosing the origin in the phase space, analogous to the present endogenous case, in which a fraction of hyperpolygon vertices belong to the attractor. In the scaling regime, the behavior depends on the scaling of the step sizes with diversity, rather than the actual sequence of the steps. Consequently, the behavior is similar to that of the endogenous game.

The present results can be extended to higher values of $m$. For $m = 2$, analysis using the de Bruijn sequence explicitly yields excellent results. For higher $m$, we approximate the attractor of the exogenous game by a hyperpolygon enclosing the origin of the phase space. Using a generating function approach, and taking into account the scaling of step sizes and kinetic sampling, the computed variance of decisions agrees qualitatively with simulations, except for values of $\alpha$ close to $\alpha_c$.

We can also make qualitative predictions about the transition from the symmetric to asymmetric phase when the complexity $\alpha$ increases [13]. From Eq. (1), the average displacement in the phase space is given by

$\langle A^\mu(t) - A^\mu(0)\rangle = k_\mu(t)\sqrt{\frac{2}{\pi(R + 2D\langle k^2\rangle)}}$, (2)

where $\langle k^2\rangle$ represents the mean of $k_\nu(t)^2$ for all $\nu < D$. For $\rho \sim \alpha \sim 1$, it can be verified that $A^\mu(t) - A^\mu(0)$ is self-averaging. Suppose the game dynamics leads to an attractor near the origin, with $\langle A^\mu(t)\rangle \to 0$. Noting that $\langle A^\mu(0)^2\rangle \sim 1/N$, we obtain the self-consistent relation $\langle k^2\rangle = \rho/[2(\alpha_c - \alpha)]$, where $\alpha_c = 1/\pi \approx 0.318$. This means that when $\alpha$ approaches $\alpha_c$, the average step size approaches 0 in the asymptotic limit. There is a critical slowing down since the convergence time diverges. When $\alpha$ exceeds $\alpha_c$, the average step size vanishes before the system reaches the attractor near the origin, so that the state of the system is trapped at locations with at least some components being nonzero. The interpretation is that when $\alpha$ is large, the distributions of strategies become so sparse that motions in the phase space cannot be achieved by the switching of strategies. Note that the value of $\alpha_c$ is close to the value of 0.337 obtained by the continuum approximation or batch update.
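
The self-consistent relation can be obtained in three lines. The sketch assumes that the average displacement takes the Gaussian form $k_\mu(t)\sqrt{2/[\pi(R + 2D\langle k^2\rangle)]}$, in which the variance $R$ of the initial displacement is broadened by the variance $2D\langle k^2\rangle$ of $\Omega_\alpha(t) - \Omega_\beta(t)$ over random strategy pairs. It also uses $R = \rho N$ and $D = \alpha N$.

```latex
% Assumed form: <A^mu(t) - A^mu(0)> = k_mu(t) sqrt{2 / [pi (R + 2 D <k^2>)]}
\begin{align*}
\langle A^\mu(t)\rangle \to 0
  &\;\Rightarrow\; A^\mu(0) = -k_\mu \sqrt{\frac{2}{\pi\,(R + 2D\langle k^2\rangle)}}, \\
\langle A^\mu(0)^2\rangle = \frac{1}{N}
  &\;\Rightarrow\; \pi\,(R + 2D\langle k^2\rangle) = 2N\langle k^2\rangle, \\
\langle k^2\rangle
  &= \frac{\pi R}{2N - 2\pi D}
   = \frac{\pi\rho}{2\,(1 - \pi\alpha)}
   = \frac{\rho}{2\,(\alpha_c - \alpha)}, \qquad \alpha_c = \frac{1}{\pi}.
\end{align*}
```

The divergence of $\langle k^2\rangle$ at $\alpha_c$ makes the square root, and hence the average step size, vanish there.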

From the perspective of game theory, it is natural to consider whether the introduction of diversity assists the game to reach a Nash equilibrium. It has been verified that Nash equilibria consist of pure strategies. Hence all frozen agents have no incentives to switch their strategies. In fact, since the dynamics in the attractor is periodic, the payoffs of all strategies become zero when averaged over a period. Thus, the Nash equilibrium is approached in the sense that the fraction of fickle agents decreases with increasing diversity. In the extremely diverse limit, it is probable that only one fickle agent switches strategy at each step in the attractor. In this case, even the fickle agent cannot increase her payoff, since on switching she always remains on the majority side and loses. Then a Nash equilibrium is reached exactly. For $m = 1$, for example, a Nash equilibrium is reached in this way with probability 7/11.

In summary, we have studied the effects of diversity in the initial preference of strategies on a game with adaptive agents competing for finite resources. Scaling of step sizes accounts for the behavior of the variance of decisions in the scaling regime ($\rho \sim 1$). At high diversity, we find that the scaling mechanism is supplemented by kinetic sampling, a mechanism self-imposed by the requirement to stay in the attractor. In extremely diverse systems, we discover further a waiting mechanism, when agents who are unable to complete the attractor dynamics alone wait for other agents to collaborate with them. Together, these mechanisms yield theoretical predictions with excellent agreement with simulations over 9 decades of data. By introducing diversity, the variance of decisions in the symmetric phase decreases, showing that the maladaptive behavior is reduced.

The combination of scaling, kinetic sampling and waiting in accounting for the steady state properties of the system illustrates the importance of dynamical considerations in describing the system behavior. We anticipate that these dynamical effects will play a crucial role in explaining the system behavior in the entire symmetric phase, since when $\alpha$ increases, the state motion in a high dimensional phase space can easily shift the tail of the virtual point distributions to the verge of strategy switching, leading to the sparseness condition where kinetic sampling and waiting effects are relevant. Due to the generic nature of these effects, we expect that they are relevant to minority games with different payoff functions and updating rules, as well as other multi-agent systems.